YASS: Similarity search in DNA sequences
Abstract
We describe YASS, a new tool for finding local similarities in DNA sequences. The YASS algorithm first scans the sequence(s) and creates, on the fly, groups of seeds (small exact repeats obtained by hashing) according to statistically founded criteria. It then tries to extend those groups into similarity regions on the basis of a new extension criterion. The method can be seen as a compromise between single-seed (BLAST) and multiple-seed (FASTA, BLAT) approaches, and achieves a gain in both sensitivity and selectivity. The method is flexible and can be made more efficient by using spaced seeds, in particular transition-constrained spaced seeds. We provide examples of applying YASS to Saccharomyces cerevisiae and Drosophila melanogaster chromosomes.
Key-words: YASS, local alignment, spaced seeds, transitions
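The seed-and-group idea in the abstract can be illustrated with a minimal sketch. It is not the YASS implementation: the pattern symbols (`#` for an exact match, `@` for a match or a transition, `-` for a ignored position), the function names, and the fixed `max_gap` / `max_diag_shift` thresholds are all assumptions made for illustration, whereas the abstract says YASS derives its grouping criteria statistically. The extension step is omitted.

```python
from collections import defaultdict

# Transitions swap bases within a class: purines (A<->G) or pyrimidines (C<->T).
BASE_CLASS = {"A": "R", "G": "R", "C": "Y", "T": "Y"}


def seed_key(word, pattern):
    """Project a word through a spaced-seed pattern (hypothetical notation).

    '#' keeps the base, '@' keeps only its purine/pyrimidine class,
    and '-' ignores the position entirely.
    """
    key = []
    for base, symbol in zip(word, pattern):
        if symbol == "#":
            key.append(base)
        elif symbol == "@":
            key.append(BASE_CLASS.get(base, base))
    return "".join(key)


def find_seed_hits(seq_a, seq_b, pattern):
    """Return (i, j) pairs where seq_a[i:] and seq_b[j:] agree under the pattern."""
    span = len(pattern)
    index = defaultdict(list)  # hashed seed key -> positions in seq_a
    for i in range(len(seq_a) - span + 1):
        index[seed_key(seq_a[i:i + span], pattern)].append(i)
    hits = []
    for j in range(len(seq_b) - span + 1):
        for i in index.get(seed_key(seq_b[j:j + span], pattern), ()):
            hits.append((i, j))
    return hits


def group_hits(hits, max_gap, max_diag_shift):
    """Greedily group hits that are close in position and on nearby diagonals.

    The thresholds are plain parameters here (an assumption of this sketch).
    """
    groups = []
    for i, j in sorted(hits):
        for group in groups:
            last_i, last_j = group[-1]
            near = 0 <= i - last_i <= max_gap
            same_diag = abs((j - i) - (last_j - last_i)) <= max_diag_shift
            if near and same_diag:
                group.append((i, j))
                break
        else:
            groups.append([(i, j)])
    return groups
```

For example, `find_seed_hits("ACGT", "ATGT", "#@#")` reports a hit at `(0, 0)`, because the C/T difference at the `@` position is a transition. Groups returned by `group_hits` would then be the candidates for extension into similarity regions.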
Similar resources
YASS: enhancing the sensitivity of DNA similarity search
YASS is a DNA local alignment tool based on an efficient and sensitive filtering algorithm. It applies transition-constrained seeds to specify the most probable conserved motifs between homologous sequences, combined with a flexible hit criterion used to identify groups of seeds that are likely to exhibit significant alignments. A web interface (http://www.loria.fr/projects/YASS/) is available ...
Development of an Efficient Hybrid Method for Motif Discovery in DNA Sequences
This work presents a hybrid method for motif discovery in DNA sequences. The proposed method, called SPSO-Lk, borrows the concept of Chebyshev polynomials and uses stochastic local search to improve the performance of the basic PSO algorithm as a motif finder. The Chebyshev polynomial concept encourages us to use a linear combination of previously discovered velocities beyond that proposed b...
Indexing DNA Sequences Using q-Grams
Recent years have seen a growing interest in similarity search on large collections of biological sequences. This paper presents a method for efficiently indexing DNA sequences based on q-grams, to facilitate similarity search in a DNA database and sidestep the need for a linear scan of the entire database. Two level index – hash table and c-trees – are ...
Estimating the Redundancy Factor for RA-encoded sequences and also Studying Steganalysis Performance of YASS
Our recently introduced JPEG steganographic method called Yet Another Steganographic Scheme (YASS) can resist blind steganalysis by embedding data in the discrete cosine transform (DCT) domain in randomly chosen image blocks. To maximize the embedding rate for a given image and a specified attack channel, the redundancy factor used by the repeat-accumulate (RA) code based error correction frame...
The Investigation of Mutations and Comparison of Leptin Gene Pro-Motor in Najdi Cattle with the Database NCBI Sequences
Objective: Identifying the genetic aspects and major genes that influence energy balance, milk production, fertility, food safety and consumer are recent interests of genetics and breeding researchers. Methods: Najdi cattle is the most prominent breed in Khuzestan province. For this study, blood samples were taken from 15 Najdi cattle at the Shoushtar Najdi Cattle Station. DNA was extracted from wh...